Abstract

Covering and packing problems can be modeled as games to encapsulate interesting social and engineering
settings. These games have a high Price of Anarchy in their natural formulation. However,
existing research applicable to specific instances of these games has only been able to prove fast convergence
to arbitrary equilibria. This paper studies general classes of covering and packing games with
learning dynamics models that incorporate a central authority who broadcasts weak, socially beneficial
signals to agents that otherwise only use local information in their decision-making. Rather than illustrating
convergence to an arbitrary equilibrium that may have very high social cost, we show that these
systems quickly achieve near-optimal performance.

In particular, we show that in the public service advertising model of [1], reaching a small constant 
fraction of the agents is enough to bring the system to a state within a log n factor of optimal in a broad 
class of set cover and set packing games or a constant factor of optimal in the special cases of vertex cover 
and maximum independent set, circumventing social inefficiency of bad local equilibria that could arise 
without a central authority. We extend these results to the learn-then-decide model of [2], in which agents
use any of a broad class of learning algorithms to decide in a given round whether to behave according to 
locally optimal behavior or the behavior prescribed by the broadcast signal. The new techniques we use 
for analyzing these games could be of broader interest for analyzing more general classic optimization 
problems in a distributed fashion. 

1 Introduction 

Set covering and packing problems are important and interesting not only from a classical optimization 
point of view, but also as a game theoretic framework for analyzing social problems in which willful agents 
are inherent cost minimizers and for solving engineering systems problems in which programmable agents 
have some degree of autonomy in seeking solutions to distributed optimization problems. In this paper, we 
model covering and packing problems as games, and we use models from learning theory to describe local 
decision making by players in these games. As opposed to previous work, we are interested in demonstrating 
convergence not to arbitrary local equilibria but to states that are low cost relative to the global optimum. 
We accomplish this by incorporating a globally-informed central authority into natural behavior dynamics. 
Problem. Given a universe of elements with associated costs and a collection of sets of these elements, the
minimum weighted set cover optimization problem is to choose the lowest cost subset of elements such that 
each set is represented by at least one chosen element. While this problem is NP-hard, good approximation 
algorithms exist. However, such algorithms tend to be centralized in nature and require global knowledge. 

Game. We analyze a setting in which a central authority knows a good approximation, but elements are
modeled as only locally aware agents with cost functions representing a natural distributed game interpretation
of the core optimization problem. We generalize the problem by not requiring total coverage; rather,
the importance of covering a given set is determined by its set weight. Each element $i$ that chooses to be
on incurs his own cost $c_i$, and each element $i$ that is off pays the sum of the weights of sets he participates
in that do not contain any other on element. If the element costs are all smaller than the set weights, then
the cost-minimizing set of on elements is also the optimal set cover. If additionally each set is of size two,
then this is the special case of a minimum weighted vertex cover problem. By simply redefining the cost
structure so that $i$ pays $c_i$ if he is off and the sum of weights of fully-covered sets he participates in if he is
on, we can interpret this new game as a packing problem with maximum independent set as a special case.

Social and engineering applications. Our motivation for this game theoretic approach is two-fold. The 
first setting is a social one in which agents have inherent costs associated with being on or off that correlate 
with the social objective. As a concrete example, suppose a government wishes to set up a network of offices,
say homeless shelters, that perform some service to the local community. Society would like the lowest 
cost solution that adequately addresses the needs of most communities, but for political reasons it may not 
be possible to enforce an optimal solution in a top-down manner. Furthermore, individual counties have 
competing interests in that they desire their own area to be served but incur some cost by opening a shelter. 

Another motivation is the setting in which non-autonomous agents are programmed to make decisions 
based on their surroundings. The extensive literature on cooperative control has shown that in this setting 
many optimization problems can be conveniently solved in a distributed fashion by endowing agents with 
artificial individual objective functions and cost-minimizing behavior. Many of these games and dynamics 
models result in convergence to a Nash equilibrium, or local optimum. In particular, several papers have 
modeled sensor networks as a special case of our set cover game. The elements are autonomous sensors, 
and a geographic region is a set consisting of elements corresponding to sensors that could cover that region.
A sensor that is on is charged some fixed cost, whereas a sensor that is off is charged a cost proportional to 
the number or importance of its adjacent regions that are uncovered by any other sensor. This application 
is particularly well-suited for cooperative control because sensors can only observe the behavior of other 
sensors in their neighborhoods, and the structure of the network may not be known ahead of time, making it 
impossible for a central designer to program the sensors with an optimal solution. 

Equilibrium quality and dynamics models. Much of the work on cooperative control and dynamics- 
based algorithmic game theory only guarantees that systems converge to some equilibrium. Many games, 
however, have a high Price of Anarchy (PoA), where PoA means the worst case ratio between the social 
cost in an equilibrium and that of the global optimal configuration (see Section 2.1 for its formal definition). 
The following special case illustrates that PoA is $\Omega(n)$ in our set cover game. Suppose $n$ agents (or players)
are charged some amount $c < 1$ when they are on and otherwise penalized 1 for every incident uncovered
set. Then a star graph in which vertices are agents and edges are sets has a global optimum with only the
center on, yielding social cost $c$, compared to a low quality Nash equilibrium in which only the center is off,
yielding social cost $c(n - 1)$.
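The star example can be checked numerically. The following Python sketch is illustrative only: the helper names (`agent_cost`, `social_cost`, `is_nash`, `star_instance`) and the choice $n = 6$, $c = 0.5$ are assumptions made for this example, and the cost rule is the one described informally in the Game paragraph above.

```python
def agent_cost(i, state, sets, c, w):
    """Cost of agent i: c[i] if on, else the total weight of uncovered sets containing i."""
    if state[i]:
        return c[i]
    return sum(w[k] for k, s in enumerate(sets)
               if i in s and not any(state[j] for j in s))


def social_cost(state, sets, c, w):
    """Social cost: the sum of all agents' individual costs."""
    return sum(agent_cost(i, state, sets, c, w) for i in range(len(state)))


def is_nash(state, sets, c, w):
    """True if no single agent can strictly lower its own cost by flipping on/off."""
    for i in range(len(state)):
        flipped = list(state)
        flipped[i] = not flipped[i]
        if agent_cost(i, flipped, sets, c, w) < agent_cost(i, state, sets, c, w):
            return False
    return True


def star_instance(n, c):
    """Star graph on n agents: agent 0 is the center, each edge {0, leaf} is a set of weight 1."""
    sets = [frozenset({0, leaf}) for leaf in range(1, n)]
    return sets, [c] * n, [1] * (n - 1)


n, c = 6, 0.5
sets, costs, weights = star_instance(n, c)
center_only = [True] + [False] * (n - 1)   # global optimum
leaves_only = [False] + [True] * (n - 1)   # low quality equilibrium
good = social_cost(center_only, sets, costs, weights)
bad = social_cost(leaves_only, sets, costs, weights)
```

Both configurations pass `is_nash`, and the ratio `bad / good` equals $n - 1$ for this instance, matching the $\Omega(n)$ gap described above.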

The more general problem of dynamics for games with high PoA is addressed in [1, 2], in which authors 
propose three models of distributed and semi-selfish social behavior in a general repeated game setting. 
The models share the common feature that a central authority has knowledge of some joint strategy profile
with low social cost, and this authority broadcasts this strategy in the hopes that players will adopt their
prescribed strategies. Specifically, the public service advertising model (PSA) of [1] assumes that each
agent independently has probability $\alpha$ of receiving and temporarily adopting the advertising strategy.
Those that do not receive and adopt their prescribed strategy behave in a myopic best response manner. This
model is well-suited for an engineering systems setting, where we do not expect all components to receive
the central authority's signal. The learning models of [2] assume that each agent uses any of a broad class
of learning algorithms to continually choose between acting according to their local best response move and 
their broadcasted signal. In the learn-then-decide (LTD) model, agents eventually commit to one of these 
options. These models are better motivated by a social setting where agents that are only locally aware are 
interested in exploring the advertising strategy with the hopes that it will benefit them personally. These 
papers provide high quality guarantees for particular games, including fair cost-sharing and party affiliation 
games. 

Our results. The positive theoretical guarantees about social welfare in the outcomes of the games studied 
in the advertising and learning models of [1, 2] serve as motivation to use these models in studying our
general classes of set cover and packing games, which apply to engineering systems applications
such as sensor networks as well as more purely game theoretic settings. For the case where costs of agents 
and weights of sets are bounded below and above by constants, we show the following: 

R1. In vertex cover games¹, we show that for any advertising strategy $s^{ad}$,
the dynamics of agents converges to a state of expected cost $O(\mathrm{cost}(s^{ad}))$ in the PSA and LTD models.

R2. In set cover games, we show that for any advertising strategy $s^{ad}$,
the dynamics of agents converges to a state of expected cost

$$O(\Delta_2 \cdot \mathrm{cost}(s^{ad})^2) \text{ in the PSA model}, \qquad O(\Delta_2 \cdot \mathrm{cost}(s^{ad})^2) \text{ in the LTD model},$$

where $\Delta_2$ is the maximum number of sets containing two given agents.

R3. In set cover games, we show that for a specific advertising strategy $s^{ad}$,
the dynamics of agents converges to a state of cost $O(\mathrm{cost}(s^{ad}))$ with high probability in the PSA model.
Moreover, we present a poly-time algorithm to find such a specific $s^{ad}$ of low cost, i.e.

$$\mathrm{cost}(s^{ad}) = O(\Delta_2 \log n \cdot \mathrm{OPT}),$$

where OPT is the optimal (social) cost.

Furthermore, we emphasize that all the above convergence guarantees happen in a polynomial number
of steps in terms of the number of agents. As we mentioned earlier, without such advertising strategies,
agents can be stuck in an inefficient equilibrium state of cost $\Omega(n) \cdot \mathrm{OPT}$, even restricted to vertex cover games
(i.e. $\Delta_2 = 1$). We also discuss extensions to the case where the costs of agents and weights of sets are not
bounded below or above by constants.

¹ As mentioned earlier, a set cover game where each set has size 2 is called a vertex cover game, and in such games equilibria
have natural connections to vertex covers in the graph induced by the sets (i.e. edges).
Related work. Achieving global coordination in distributed multi-agent systems is a central problem of
control theory with multiple real-world applications (see [ I ' ] and references therein). More specifically, 
several papers consider game theoretic formulations of covering problems which are inspired by practical 
sensor network problems [15, 11, 14, 3]. In particular, [3] analyzes a game that is a specific case of the problem
addressed in this paper. However, [3] and many other control theory papers guarantee only convergence
to stable states which are locally optimal. Since these games often have a high Price of Anarchy [10, 13],
the results do not translate to global performance guarantees. 

A number of approaches have been explored to circumvent such bad PoA results. In [17] the authors 
assume that the authorities enjoy complete control over some fraction of the agents. Similarly, [6, 7] focus 
on the problem of identifying and controlling the influential nodes of a network. While we also use a special 
type of advertising for improved results in Theorem 4, we do not require particular control over certain 
agents. Rather, the models we use from [1, 2] incorporate strategic behavior for all agents. Another line of
research offers stronger performance guarantees using specific learning algorithms that employ equilibrium 
selection [9, 4, 5] or cyclic behavior [8]. Unfortunately, these techniques do not yield guarantees of fast 
convergence to good states in our class of games. 

Our analysis builds on the works of [1, 2], in which authors propose game theoretic models of distributed
and semi-selfish social behavior in a general repeated game setting. The models share the common feature 
that a central authority has knowledge of some joint strategy profile with low social cost, and this authority 
broadcasts this strategy in the hopes that players will adopt their prescribed strategies. These papers provide 
quality guarantees for particular games, including fair cost-sharing and party affiliation games. By using 
these models, we do not have to make the hard choice between enforcing top-down solutions (which may 
be infeasible in both engineering systems and social settings) and poor performance guarantees. Instead, 
we show that for a broad class of covering and packing problems, incorporating mild influence from a weak 
central authority guides the system into a near-optimal state when agents are only optimizing locally. 

2 Preliminaries 

2.1 Background on General Games 

We represent a general game as a triple $\mathcal{G} = (N, (S_i), (\mathrm{cost}_i))$, where $N$ is a set of $n$ players, $S_i$ is the
finite action space of player $i \in N$, and $\mathrm{cost}_i$ denotes the cost function of player $i$. The joint action space
of the players is $S = S_1 \times \cdots \times S_n$. For a joint action $s \in S$, we denote by $s_{-i}$ the actions of all players
$j \neq i$. Players' cost functions map joint actions to non-negative real numbers, i.e. $\mathrm{cost}_i : S \to \mathbb{R}_{\geq 0}$ for
all $i \in N$. In this paper, we define a social cost function, $\mathrm{cost} : S \to \mathbb{R}$, simply as the sum of
individual players' costs. The optimal social cost is denoted by

$$\mathrm{OPT} = \min_{s \in S} \mathrm{cost}(s).$$

Given a joint action $s$, the best response of player $i$ is the set of actions that minimizes player $i$'s cost
subject to the other players' fixed actions $s_{-i}$, i.e.

$$BR_i(s_{-i}) = \arg\min_{a \in S_i} \mathrm{cost}_i(a, s_{-i}).$$

Best response dynamics is a process in which at each time step, an arbitrary player not already playing a best
response updates his action to one in his current best response set. A joint action $s \in S$ is a pure Nash
equilibrium if no player $i \in N$ can benefit from deviating to another action, namely, $s_i \in BR_i(s_{-i})$ for
every $i \in N$.

A game $\mathcal{G}$ is called an exact potential game [12] if there exists a potential function $\Phi : S \to \mathbb{R}$ such that

$$\mathrm{cost}_i(a', s_{-i}) - \mathrm{cost}_i(a, s_{-i}) = \Phi(a', s_{-i}) - \Phi(a, s_{-i}),$$

for all $i \in N$, $s_{-i} \in S_{-i}$, and $a, a' \in S_i$. For general potential games, only the signs of both sides of these
equations must be equal. While general games are not guaranteed to have a pure Nash equilibrium, all finite
potential games do, and furthermore best response dynamics in such games always converges to a pure Nash
equilibrium [12, 13]. However, the convergence time can be exponentially large in terms of the number of
players in general.

Two well known concepts for quantifying the inefficiency of equilibria relative to non-equilibria are 
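Price of Anarchy and Price of Stability. For $\mathcal{N}(\mathcal{G})$ the set of pure Nash equilibria of game $\mathcal{G}$, Price of
Anarchy (PoA) and Price of Stability (PoS) are defined as

$$\mathrm{PoA} = \max_{s \in \mathcal{N}(\mathcal{G})} \frac{\mathrm{cost}(s)}{\mathrm{OPT}}, \qquad \mathrm{PoS} = \min_{s \in \mathcal{N}(\mathcal{G})} \frac{\mathrm{cost}(s)}{\mathrm{OPT}}.$$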

2.2 Covering Game 

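Given agents $[n] = \{1, 2, \ldots, n\}$, a collection of sets $\mathcal{F} \subseteq 2^{[n]}$, costs $c_i$ for $i \in [n]$, and weights $w_\sigma$ for
$\sigma \in \mathcal{F}$, we describe the covering game $\mathcal{G} = ([n], (S_i), (\mathrm{cost}_i))$ where actions $S_i = \{\text{on}, \text{off}\}$ and cost
$\mathrm{cost}_i$ as defined in (1) for every agent $i \in [n]$. We let $\Delta_k$ be the '$k$-th order' maximum degree of the
hypergraph induced by sets. Namely,

$$\Delta_k = \Delta_k(\mathcal{G}) := \max_{i_1, \ldots, i_k \in [n]} \left| \{ \sigma \in \mathcal{F} : \{i_1, \ldots, i_k\} \subseteq \sigma,\ i_j \neq i_{j'} \text{ for } j \neq j' \} \right|.$$

In addition, we define

$$c_{\max} := \max_{i \in [n]} c_i, \qquad c_{\min} := \min_{i \in [n]} c_i, \qquad w_{\max} := \max_{\sigma \in \mathcal{F}} w_\sigma, \qquad w_{\min} := \min_{\sigma \in \mathcal{F}} w_\sigma.$$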
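For small instances, $\Delta_k$ can be computed by direct counting. The sketch below is a minimal illustration, assuming sets are given as Python sets of agent indices; the function names `delta_k` and `f_max` are hypothetical, and `f_max` computes the size of the largest set (the quantity called $F_{\max}$ later in this section).

```python
from collections import Counter
from itertools import combinations


def delta_k(sets, k):
    """Maximum number of sets that contain a common group of k distinct agents."""
    counts = Counter()
    for s in sets:
        # Every k-subset of s is a group of k distinct agents contained in s.
        for group in combinations(sorted(s), k):
            counts[group] += 1
    return max(counts.values(), default=0)


def f_max(sets):
    """Size of the largest set in the collection."""
    return max((len(s) for s in sets), default=0)


example_sets = [{0, 1, 2}, {0, 1, 3}, {2, 3}]
second_order_degree = delta_k(example_sets, 2)  # agents 0 and 1 share two sets
```

The enumeration touches every $k$-subset of every set, so it is only practical when set sizes are small, which matches the $F_{\max} = O(1)$ regime considered in this paper.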

Before defining the cost functions, we introduce some notation. We say a set $\sigma \in \mathcal{F}$ is 'covered' in
joint strategy $s$ if $s_i = \text{on}$ for some $i \in \sigma$. Otherwise, $\sigma$ is said to be 'uncovered'. Denote the collection
of sets that include agent $i$ and are uncovered in $s$ with $\mathcal{F}_i^u(s)$, or simply $\mathcal{F}_i^u$ when $s$ is clear from context.
The entire set of uncovered sets is written $\mathcal{F}^u = \bigcup_{i \in [n]} \mathcal{F}_i^u$. For $\sigma \subseteq [n]$, define $c(\sigma) := \sum_{i \in \sigma} c_i$, and for
$\mathcal{F}' \subseteq \mathcal{F}$, define $w(\mathcal{F}') := \sum_{\sigma \in \mathcal{F}'} w_\sigma$. The cost function of agent $i$ is defined with respect to any
joint strategy $s \in S$ as follows:

$$\mathrm{cost}_i(s) = \begin{cases} c_i & \text{if } s_i = \text{on} \\ w(\mathcal{F}_i^u) & \text{if } s_i = \text{off.} \end{cases} \qquad (1)$$

Observe that $c_i$ expresses how much agent $i$ prefers to cover the sets containing $i$. For example, if $c_{\max} < w_{\min}$,
then each agent prefers to avoid the situation in which there exists an uncovered set containing her. As
we explain in Section 5, these covering games can be interpreted as equivalent packing games.

For a joint action (or strategy profile) $s \in S$, let ON($s$) and OFF($s$) be the sets of nodes that are on and off,
respectively. It is easy to check that the social cost has the following simple form:

$$\mathrm{cost}(s) = \sum_{i \in [n]} \mathrm{cost}_i(s) = c(\mathrm{ON}(s)) + \sum_{\sigma \in \mathcal{F}^u} |\sigma| \cdot w_\sigma. \qquad (2)$$

Best response convergence. Recall that best response dynamics converge to pure Nash equilibria for
potential games. Now observe that the covering game is an exact potential game with potential function

$$\Phi(s) = c(\mathrm{ON}(s)) + w(\mathcal{F}^u). \qquad (3)$$

Combining this observation with the social cost formula implies that for any $s \in S$ we have

$$\Phi(s) \leq \mathrm{cost}(s) \leq F_{\max} \cdot \Phi(s), \qquad (4)$$

where we let $F_{\max}$ be the size of the largest set, i.e. $F_{\max} = \max_{\sigma \in \mathcal{F}} |\sigma|$.
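The exact-potential identity and the bounds (4) can be verified exhaustively on a small instance. The sketch below is illustrative only: the instance (four agents, four sets, integer costs and weights) and all function names are assumptions chosen for the example, and the cost rule follows the definition of the agent cost in (1).

```python
from itertools import product


def uncovered_sets(state, sets):
    """Indices of sets that have no on member."""
    return [k for k, s in enumerate(sets) if not any(state[j] for j in s)]


def agent_cost(i, state, sets, c, w):
    """Equation (1): c[i] if on, else the weight of uncovered sets containing i."""
    if state[i]:
        return c[i]
    return sum(w[k] for k in uncovered_sets(state, sets) if i in sets[k])


def social_cost(state, sets, c, w):
    """Sum of individual costs, as in equation (2)."""
    return sum(agent_cost(i, state, sets, c, w) for i in range(len(state)))


def potential(state, sets, c, w):
    """Equation (3): cost of on agents plus total weight of uncovered sets."""
    on_cost = sum(c[i] for i, on in enumerate(state) if on)
    return on_cost + sum(w[k] for k in uncovered_sets(state, sets))


def check_potential(sets, c, w):
    """Check the exact-potential identity and the bounds (4) over all joint actions."""
    n = len(c)
    largest = max(len(s) for s in sets)
    for state in product([False, True], repeat=n):
        phi = potential(state, sets, c, w)
        cost = social_cost(state, sets, c, w)
        if not (phi <= cost <= largest * phi):
            return False
        for i in range(n):
            flipped = list(state)
            flipped[i] = not flipped[i]
            d_cost = agent_cost(i, flipped, sets, c, w) - agent_cost(i, state, sets, c, w)
            d_phi = potential(flipped, sets, c, w) - phi
            if d_cost != d_phi:  # a unilateral move must change both by the same amount
                return False
    return True


sets = [{0, 1, 2}, {1, 3}, {2, 3}, {0, 3}]
ok = check_potential(sets, c=[2, 1, 3, 2], w=[3, 1, 2, 4])
```

Integer costs and weights are used so that the equality test is exact; with floating-point inputs a tolerance would be needed.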
Optimization and equilibrium quality. The star graph example from the introduction reveals that PoA
in the covering game can be very large. More generally, certain covering game instances exist with PoA
$\Omega(n)$ even restricted to the simple case $\Delta_2 = 1$.² This motivates the need for efficient dynamics with better
guarantees than convergence to arbitrary equilibria.

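As a step in that direction, here we provide a centralized LP-rounding-based poly-time algorithm to find
a low-cost configuration $s^{ad}$ for the covering game as follows.

1. Solve the following linear program (LP), and obtain the solution $x^*$.

$$\text{minimize } \sum_{i=1}^{n} c_i \cdot x_i \quad \text{subject to } \sum_{j \in \sigma} x_j \geq 1 \ \ \forall \sigma \in \mathcal{F}, \quad x_i \in [0, 1]. \qquad (5)$$

2. Set

$$s_i^{ad} = \begin{cases} \text{on} & \text{if } x_i^* \geq 1/F_{\max} \\ \text{off} & \text{otherwise.} \end{cases}$$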
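The rounding step can be sketched as follows. This is an illustration under stated assumptions: the fractional solution `x_star` is taken as given (in practice it would come from an LP solver, which is not shown), the function names are hypothetical, and the triangle instance with unit costs is chosen because $x^* = (1/2, 1/2, 1/2)$ is an optimal solution of (5) for it.

```python
def is_lp_feasible(x, sets, tol=1e-9):
    """Check the constraints of LP (5): x in [0, 1]^n and each set has mass at least 1."""
    in_box = all(-tol <= xi <= 1 + tol for xi in x)
    return in_box and all(sum(x[j] for j in s) >= 1 - tol for s in sets)


def round_lp_solution(x_star, sets, tol=1e-9):
    """Step 2: turn agent i on exactly when x_star[i] >= 1 / F_max."""
    largest = max(len(s) for s in sets)
    return [xi >= 1.0 / largest - tol for xi in x_star]


def covers_all(state, sets):
    """True if every set contains at least one on agent."""
    return all(any(state[j] for j in s) for s in sets)


# Triangle graph with unit costs (hypothetical example instance).
sets = [{0, 1}, {1, 2}, {0, 2}]
costs = [1.0, 1.0, 1.0]
x_star = [0.5, 0.5, 0.5]

s_ad = round_lp_solution(x_star, sets)
lp_value = sum(ci * xi for ci, xi in zip(costs, x_star))
rounded_cost = sum(ci for ci, on in zip(costs, s_ad) if on)
```

Because each set has at most $F_{\max}$ elements and fractional mass at least 1, some element of every set passes the threshold, so the rounded configuration covers all sets and its cost is at most $F_{\max}$ times the LP value.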



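The following lemma proves that the algorithm is an $F_{\max} \lceil c_{\max}/w_{\min} \rceil$-approximation algorithm for minimizing
the social cost (2).

Lemma 1. The configuration $s^{ad}$ obtained from the algorithm has

$$\mathrm{cost}(s^{ad}) \leq F_{\max} \lceil c_{\max}/w_{\min} \rceil \cdot \mathrm{OPT},$$

where we recall that $\mathrm{OPT} = \min_s \mathrm{cost}(s)$.

Proof. Let $s^*$ be the optimal configuration, i.e. $\mathrm{cost}(s^*) = \mathrm{OPT}$. If there exists an uncovered set $\sigma$ under the
configuration $s^*$, choose one element from $\sigma$ and force it to be turned on. Repeat this procedure until all sets
are covered, and let $s^\dagger$ be the resulting configuration. Now observe that $\mathrm{cost}(s^\dagger) \leq \lceil c_{\max}/w_{\min} \rceil \cdot \mathrm{OPT}$.
Therefore, it follows that

$$\mathrm{cost}(s^{ad}) \leq F_{\max} \cdot \sum_i c_i \cdot x_i^* \leq F_{\max} \cdot \mathrm{cost}(s^\dagger) \leq F_{\max} \lceil c_{\max}/w_{\min} \rceil \cdot \mathrm{OPT}.$$

The first inequality holds because every set has at most $F_{\max}$ elements, so each set contains some $j$ with $x_j^* \geq 1/F_{\max}$ and
$s^{ad}$ covers all sets; the second holds because the indicator vector of $\mathrm{ON}(s^\dagger)$ is feasible for (5).

□

Under the assumption (6), this is an $O(F_{\max})$-approximation algorithm for the optimal social cost.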



3 Public Service Advertising 

In this section and the following one, we show that price of anarchy is avoidable in covering games even 
using best response-inspired dynamics as long as these dynamics incorporate some form of suggestion from 
a weak central authority that is aware of a high quality equilibrium. 

² For example, let $c_i = c$ for all $i \in [n]$ and let $w_\sigma = 1$ for all $\sigma \in \mathcal{F}$. Label an arbitrary set of $\lceil c \rceil$ elements $L$, and label the
other elements $R$. Define $\mathcal{F}$ to be all sets with one element in $L$ and one in $R$. It is straightforward to check that the solution with
all $L$ on and all $R$ off is a Nash equilibrium with cost $c \cdot \lceil c \rceil$, while the solution with all $L$ off and all $R$ on is a Nash equilibrium
with cost $c \cdot (n - \lceil c \rceil)$.

The first model we study in this paper is the public service advertising (PSA) model in [1] in which a
central authority broadcasts a strategy for each agent, which some agents receive and temporarily follow.
Player behavior is described in two phases:

1: Play begins in an arbitrary state, and a central authority advertises joint action $s^{ad} \in S$. Each agent
receives the proposed strategy independently with probability $\alpha \in (0, 1)$. Agents that receive this
signal are called receptive. Receptive agents play their advertising strategies throughout Phase 1,
and non-receptive agents undergo best response dynamics to settle on a joint strategy that is a Nash
equilibrium given the fixed behavior of receptive agents. We call this joint strategy $s'$.

2: All agents participate in best response dynamics until convergence to some Nash equilibrium $s''$.

Since our covering game is a potential game and all potential games eventually converge to a Nash equilibrium
under best response dynamics, both phases are guaranteed to terminate. Furthermore, convergence
occurs in poly-time with respect to parameters $\{n, c_1, \ldots, c_n, w_\sigma : \sigma \in \mathcal{F}\}$.³
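The two phases above can be sketched as a small simulation. This is a minimal sketch, not the analysis model itself: the function names, the round-robin order in which agents are offered a move, and the use of a seeded `random.Random` instance are assumptions made for illustration.

```python
import random


def agent_cost(i, state, sets, c, w):
    """c[i] if agent i is on, else the total weight of uncovered sets containing i."""
    if state[i]:
        return c[i]
    return sum(w[k] for k, s in enumerate(sets)
               if i in s and not any(state[j] for j in s))


def social_cost(state, sets, c, w):
    return sum(agent_cost(i, state, sets, c, w) for i in range(len(state)))


def best_response_dynamics(state, movers, sets, c, w):
    """Let agents in `movers` make strictly improving flips until none remains."""
    state = list(state)
    improved = True
    while improved:
        improved = False
        for i in movers:
            current = agent_cost(i, state, sets, c, w)
            state[i] = not state[i]
            if agent_cost(i, state, sets, c, w) < current:
                improved = True          # keep the strictly improving flip
            else:
                state[i] = not state[i]  # undo: the flip does not help agent i
    return state


def psa(initial, s_ad, alpha, sets, c, w, rng):
    """Return (s', s''): the states at the end of Phase 1 and Phase 2."""
    n = len(initial)
    receptive = [rng.random() < alpha for _ in range(n)]
    # Phase 1: receptive agents follow the advertisement, the others best respond.
    start = [s_ad[i] if receptive[i] else initial[i] for i in range(n)]
    others = [i for i in range(n) if not receptive[i]]
    s_prime = best_response_dynamics(start, others, sets, c, w)
    # Phase 2: every agent best responds until a Nash equilibrium is reached.
    s_double_prime = best_response_dynamics(s_prime, range(n), sets, c, w)
    return s_prime, s_double_prime


# Star graph from the introduction: agent 0 is the center, c = 0.5, unit weights.
sets = [frozenset({0, leaf}) for leaf in range(1, 6)]
costs, weights = [0.5] * 6, [1] * 5
bad_equilibrium = [False] + [True] * 5
advertised = [True] + [False] * 5
s1, s2 = psa(bad_equilibrium, advertised, 0.3, sets, costs, weights, random.Random(0))
```

Each accepted flip strictly lowers the potential, so both loops terminate. On the star instance, setting `alpha` to 0 leaves play stuck in the bad equilibrium, while setting it to 1 reproduces the advertised optimum.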



3.1 Effect of Advertising in PSA 

In this section we show that advertising helps significantly in covering games. In particular, we show that
if the advertising strategy $s^{ad}$ has low social cost, then the cost of the resulting equilibrium is low even if
only a small constant $\alpha$ fraction of the agents receive and respond to the signal. Theorem 2 formalizes the
general result of this section, and Theorem 4 improves this result for particular advertising strategies. For
convenience, in this section we assume costs and weights are bounded above and below, i.e.

$$c_{\max} := \max_{i \in [n]} c_i = O(1), \quad c_{\min} := \min_{i \in [n]} c_i = \Omega(1), \quad w_{\max} := \max_{\sigma \in \mathcal{F}} w_\sigma = O(1), \quad w_{\min} := \min_{\sigma \in \mathcal{F}} w_\sigma = \Omega(1). \qquad (6)$$

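Theorem 2. For any advertising strategy $s^{ad}$ in the PSA model,

$$E[\mathrm{cost}(s'')] \leq \begin{cases} O(1) \cdot \mathrm{cost}(s^{ad}) & \text{if } F_{\max} = 2 \\ O(\Delta_2) \cdot \mathrm{cost}(s^{ad})^2 & \text{if } F_{\max} = O(1). \end{cases} \qquad (7)$$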



Theorem 2 implies that if $s^{ad}$ is obtained from the $O(F_{\max})$-approximation poly-time algorithm described
in Section 2.2, the following corollary holds.

Corollary 3. There exists a poly-time algorithm to find an advertising strategy $s^{ad}$ for the PSA model such
that

$$E[\mathrm{cost}(s'')] \leq \begin{cases} O(1) \cdot \mathrm{OPT} & \text{if } F_{\max} = 2 \\ O(\Delta_2) \cdot \mathrm{OPT}^2 & \text{if } F_{\max} = O(1). \end{cases}$$

Effective advertising. We additionally consider advertising strategies particular to our game for improved
performance of the model. We say that advertising strategy $s^{ad}$ satisfies condition (★) if

$$\sum_{i=0}^{\lfloor c_{\max}/w_{\min} \rfloor} \binom{x}{i} \left(1 - \alpha^{F_{\max}}\right)^{x-i} \left(\alpha^{F_{\max}}\right)^{i} \leq \frac{1}{n^2} \quad \text{for all } x \geq \frac{\Delta^*}{\Delta_2 (F_{\max} - 1)}, \qquad (\star)$$

where $\Delta^* := \Delta^*(s^{ad})$ is the smallest number of sets containing a given on element in $s^{ad}$ as the unique
on element. We say $\Delta^*$ is the 'core' minimum degree of on elements in $s^{ad}$. Intuitively, condition (★)
means that each on element in the advertising strategy $s^{ad}$ 'solely' contributes a large number of sets to
cover. We establish the following stronger theorem, which implies that agents will reach a state of social cost
$O(\mathrm{cost}(s^{ad}))$ at the end of Phase 2 if $s^{ad}$ satisfies condition (★).



³ This is because $\Phi$ is bounded above and below by functions of these parameters and decreases under best response dynamics.
Theorem 4. For an advertising strategy $s^{ad}$ satisfying condition (★) in the PSA model,

$$\mathrm{cost}(s'') = O(1) \cdot \mathrm{cost}(s^{ad}) \quad \text{with probability } 1 - \frac{1}{n}, \text{ if } F_{\max} = O(1).$$

The following corollary implies that it is possible to find such an advertising strategy $s^{ad}$ of low cost.

Corollary 5. There exists a poly-time algorithm to find an advertising strategy $s^{ad}$ for the PSA model such
that

$$\mathrm{cost}(s'') = O(\Delta_2 \log n) \cdot \mathrm{OPT} \quad \text{with probability } 1 - \frac{1}{n}, \text{ if } F_{\max} = O(1).$$

Proof. Here we explain how to find an advertising strategy $s^{ad}$ satisfying condition (★) as well as being
of low cost. Observe that any joint strategy $s$ with $\Delta^* = \Delta^*(s) \geq B \Delta_2 \log n$ for a large enough constant
$B$ (depending on the constants $c_{\max}/w_{\min}$, $\alpha$, $F_{\max}$) satisfies condition (★). Then starting from the joint
strategy with social cost $O(1) \cdot \mathrm{OPT}$ obtained from the algorithm in Section 2.2, one can greedily construct a
joint strategy $s^{ad}$ satisfying condition (★) with social cost $O(\Delta_2 \log n) \cdot \mathrm{OPT}$ (greedily turning off every
agent that is the unique on element in fewer than $B \Delta_2 \log n$ sets). For the advertising strategy $s^{ad}$ satisfying
condition (★) as constructed above, the conclusion of Corollary 5 follows from Theorem 4. □
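The 'core' minimum degree $\Delta^*$ used in condition (★) and in the proof above can be computed directly from a joint strategy. The following is a minimal sketch; the function name `core_min_degree` and the convention of returning 0 when no agent is on are assumptions made for illustration.

```python
def core_min_degree(state, sets):
    """Smallest number of sets in which an on agent is the unique on member."""
    solo = {i: 0 for i, on in enumerate(state) if on}
    for s in sets:
        on_members = [j for j in s if state[j]]
        if len(on_members) == 1:
            solo[on_members[0]] += 1
    # By convention (an assumption here), return 0 if nobody is on.
    return min(solo.values(), default=0)


# Star graph on six agents: agent 0 is the center.
star_sets = [frozenset({0, leaf}) for leaf in range(1, 6)]
center_on = [True] + [False] * 5
leaves_on = [False] + [True] * 5
```

On the star graph, the center-only strategy has core minimum degree 5 (the center is the sole cover of every edge), whereas the leaves-only strategy has core minimum degree 1.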

Proof of Theorem 2 

From (4) and $F_{\max} = O(1)$, any sequence of best response moves increases social cost by at most a constant
factor. All agents best respond in Phase 2, and hence $\mathrm{cost}(s'') = O(\mathrm{cost}(s'))$. It suffices to bound
$\mathrm{cost}(s')$. At a high level, we do this by providing a bound (i.e. Lemma 6) on the total weight of uncovered
sets that are not uncovered in $s^{ad}$, and then we give a bound (i.e. Lemma 7) on the number of agents that are
off in $s^{ad}$ but on at the end of Phase 1.

First, let us introduce some notation. We say two agents contained in a common set are neighbors. Let
$L$ and $R$ denote the sets of agents that are on and off in $s^{ad}$, respectively. Let $L_{\text{off}}$ (and $R_{\text{on}}$) denote the set of
agents in $L$ (and $R$) who are off (and on) in $s'$. Let $\mathcal{F}_R$ denote the collection of sets uncovered in $s^{ad}$, and
let $\mathcal{F}_{bad}$ denote the collection of sets not in $\mathcal{F}_R$ but uncovered in $s'$. Then from (2), (6) and $F_{\max} = O(1)$,
we have

$$E[\mathrm{cost}(s')] \leq \mathrm{cost}(s^{ad}) + E[c(R_{\text{on}})] + F_{\max} \cdot E[w(\mathcal{F}_{bad})] = \mathrm{cost}(s^{ad}) + O(E[|R_{\text{on}}|]) + O(E[w(\mathcal{F}_{bad})]),$$

where we note that

$$\mathrm{cost}(s^{ad}) \geq c(L) + w(\mathcal{F}_R) = \Omega(|L|) + \Omega(|\mathcal{F}_R|).$$

Therefore, the following two lemmas bounding $w(\mathcal{F}_{bad})$ and $|R_{\text{on}}|$ lead to the desired bound on $\mathrm{cost}(s')$,
which completes the proof of Theorem 2 from $\mathrm{cost}(s'') = O(\mathrm{cost}(s'))$.

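Lemma 6. $w(\mathcal{F}_{bad}) \leq c(L)$.

Proof. Each set in $\mathcal{F}_{bad}$ should contain an element in $L_{\text{off}}$ that is best responding in $s'$. Hence,

$$w(\mathcal{F}_{bad}) = w\Big(\bigcup_{\ell \in L_{\text{off}}} \mathcal{F}_\ell^u\Big) \leq \sum_{\ell \in L_{\text{off}}} w(\mathcal{F}_\ell^u) \leq \sum_{\ell \in L_{\text{off}}} c_\ell \leq c(L),$$

where the second inequality is from the fact that $\ell$ is best responding (i.e. its cost $c_\ell$ is at least the total weight of
uncovered sets including it since it chooses to be off). This completes the proof of Lemma 6. □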
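Lemma 7.

$$E[|R_{\text{on}}|] \leq \begin{cases} |\mathcal{F}_R| + \Delta_2 \cdot |L|^2 + O(\Delta_2) \cdot |L| & \text{if } F_{\max} = O(1) \\ |\mathcal{F}_R| + O(1) \cdot |L| & \text{if } F_{\max} = 2. \end{cases}$$

Proof. Since each element $r$ in $R_{\text{on}}$ plays best response in $s'$, $r$ should be contained in a set $\sigma_r$ as the unique
on element. We define disjoint sets $R_{\text{on}}^{(1)}$ and $R_{\text{on}}^{(2)}$ such that

$$R_{\text{on}} = R_{\text{on}}^{(1)} \cup R_{\text{on}}^{(2)}$$

and $r \in R_{\text{on}}^{(1)}$ if $\sigma_r \in \mathcal{F}_R$. By definition of $R_{\text{on}}^{(1)}$, it easily follows that

$$|R_{\text{on}}^{(1)}| \leq |\mathcal{F}_R|. \qquad (8)$$

Now consider $R_{\text{on}}^{(2)}$. Let $\mathcal{F}_{\text{off}}$ be the collection of 'left' uncovered sets, i.e. $\sigma \in \mathcal{F}_{\text{off}}$ if $\sigma \cap L$ is a non-empty
subset of $L_{\text{off}}$. Hence, by definition of $R_{\text{on}}^{(2)}$, $\sigma_r$ is in $\mathcal{F}_{\text{off}}$ for each $r \in R_{\text{on}}^{(2)}$. This implies that

$$|R_{\text{on}}^{(2)}| \leq |\mathcal{F}_{\text{off}}|. \qquad (9)$$

We let $\mathcal{F}^*_{\text{off}} \subseteq \mathcal{F}_{\text{off}}$ be the collection of sets containing a unique element in $L_{\text{off}}$. Then, we have

$$|\mathcal{F}_{\text{off}}| \leq |\mathcal{F}^*_{\text{off}}| + \begin{cases} \Delta_2 \cdot |L|^2 & \text{if } F_{\max} = O(1) \\ 0 & \text{if } F_{\max} = 2. \end{cases} \qquad (10)$$

This is because the number of sets with more than one element in $L_{\text{off}}$ is bounded by $\Delta_2 \cdot |L|^2$ (remember
that each pair of agents is contained in at most $\Delta_2$ common sets). Clearly, there are no such sets when
$F_{\max} = 2$.

We now bound the expected size of $\mathcal{F}^*_{\text{off}}$. It follows that

$$E[|\mathcal{F}^*_{\text{off}}|] \leq \sum_{\ell \in L} |\mathcal{F}^*_\ell| \cdot \Pr[\ell \text{ is off}], \qquad (11)$$

where we let $\mathcal{F}^*_\ell$ be the collection of sets including $\ell$ as the unique element in $L$. Further, we observe that

$$\begin{aligned} \Pr[\ell \text{ is off}] &\leq \Pr\big[\,|\{\rho \in \mathcal{F}^*_\ell : \text{all } \rho \cap R \text{ are off}\}| \leq c_{\max}/w_{\min}\,\big] \\ &\leq \Pr\big[\,|\{\rho \in \mathcal{F}^*_\ell : \text{all } \rho \cap R \text{ are receptive}\}| \leq c_{\max}/w_{\min}\,\big] \\ &\leq \Pr\big[\,|\{\rho \in \tilde{\mathcal{F}}_\ell : \text{all } \rho \cap R \text{ are receptive}\}| \leq c_{\max}/w_{\min}\,\big], \end{aligned}$$

where we define $\tilde{\mathcal{F}}_\ell \subseteq \mathcal{F}^*_\ell$ such that no pair of sets in $\tilde{\mathcal{F}}_\ell$ have common elements in $R$ and the size of
$\tilde{\mathcal{F}}_\ell$ is not too small, i.e. $|\mathcal{F}^*_\ell| \leq (F_{\max} - 1) \Delta_2 \cdot |\tilde{\mathcal{F}}_\ell|$. From definitions of $F_{\max}$ and $\Delta_2$, the existence of
such a set $\tilde{\mathcal{F}}_\ell$ follows. Since no pair of sets in $\tilde{\mathcal{F}}_\ell$ have common elements in $R$, the events that all $\rho \cap R$ are
receptive for $\rho \in \tilde{\mathcal{F}}_\ell$ are mutually independent and each happens with probability at least $\alpha^{F_{\max}}$.
Therefore,

$$\Pr[\ell \text{ is off}] \leq \Pr\big[\,|\{\rho \in \tilde{\mathcal{F}}_\ell : \text{all } \rho \cap R \text{ are receptive}\}| \leq c_{\max}/w_{\min}\,\big] \leq \sum_{i=0}^{\lfloor c_{\max}/w_{\min} \rfloor} \binom{|\tilde{\mathcal{F}}_\ell|}{i} \left(1 - \alpha^{F_{\max}}\right)^{|\tilde{\mathcal{F}}_\ell| - i} \left(\alpha^{F_{\max}}\right)^{i}. \qquad (12)$$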
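Combining (11) and (12) implies that

$$E[|\mathcal{F}^*_{\text{off}}|] \leq \sum_{\ell \in L} |\mathcal{F}^*_\ell| \sum_{i=0}^{\lfloor c_{\max}/w_{\min} \rfloor} \binom{|\tilde{\mathcal{F}}_\ell|}{i} \left(1 - \alpha^{F_{\max}}\right)^{|\tilde{\mathcal{F}}_\ell| - i} \left(\alpha^{F_{\max}}\right)^{i} = O(\Delta_2) \cdot |L|, \qquad (13)$$

where the last equality is from the following proposition, whose proof is presented in Appendix A.

Proposition 8. For constant $\alpha \in (0, 1)$ and $0 < c \leq d$,

$$d \cdot \sum_{i=0}^{\lfloor c \rfloor} \binom{d}{i} (1 - \alpha)^{d - i} \alpha^{i} = O(\lceil c \rceil).$$

Finally, combining (8), (9), (10) and (13) leads to the desired conclusion of Lemma 7, where we note that
$\Delta_2 = 1$ when $F_{\max} = 2$. □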

Proof of Theorem 4 

We will use the same notation $R_{\text{on}}$ and $\mathcal{F}_{bad}$ as in the proof of Theorem 2. As we explain in the proof of
Theorem 2, it suffices to prove that the social cost at the end of Phase 1 is $O(\mathrm{cost}(s^{ad}))$ with probability
$1 - 1/n$.

To this end, the following lemma establishes that condition (★) ensures that all agents in $L$ turn on with
probability $1 - 1/n$ at the end of Phase 1. Under this event, only sets in $\mathcal{F}_R$ are uncovered, and the additional
social cost is incurred by agents in $R_{\text{on}}$. The lemma shows that such additional cost is at most $\mathrm{cost}(s^{ad})$.
Hence, this completes the proof of Theorem 4.

Lemma 9. If the advertising strategy $s^{ad}$ satisfies condition (★), then

$$\mathcal{F}_{bad} = \emptyset \quad \text{and} \quad c(R_{\text{on}}) \leq w(\mathcal{F}_R) \quad \text{with probability } 1 - \frac{1}{n}.$$

Proof. We will use the same notation as in the proof of Lemma 7. As in the proof of Lemma 7, for any $\ell \in L$
there is some subset $\tilde{\mathcal{F}}_\ell \subseteq \mathcal{F}^*_\ell$ such that no pair of sets in $\tilde{\mathcal{F}}_\ell$ have common elements in $R$ and
$|\tilde{\mathcal{F}}_\ell| \geq \frac{|\mathcal{F}^*_\ell|}{\Delta_2 (F_{\max} - 1)} \geq \frac{\Delta^*}{\Delta_2 (F_{\max} - 1)}$. Then as we derived in the proof of Lemma 7,

$$\Pr[\ell \text{ is off}] \leq \sum_{i=0}^{\lfloor c_{\max}/w_{\min} \rfloor} \binom{|\tilde{\mathcal{F}}_\ell|}{i} \left(1 - \alpha^{F_{\max}}\right)^{|\tilde{\mathcal{F}}_\ell| - i} \left(\alpha^{F_{\max}}\right)^{i} \leq \frac{1}{n^2},$$

where the last inequality is from condition (★). From the union bound, $\Pr[L_{\text{off}} = \emptyset] \geq 1 - 1/n$ and
hence $\mathcal{F}_{bad} = \emptyset$.

Now assume the event that all nodes in $L$ are on. Observe that for each best responding $r \in R_{\text{on}}$, $c_r$ is
no greater than the total weight of all sets containing $r$ as the unique on agent. Since we assume all nodes
in $L$ are on, these sets are a subset of $\mathcal{F}_R$. Further, since there is no overlap in these sets between different
agents in $R_{\text{on}}$, we can sum over all $r \in R_{\text{on}}$ to derive $c(R_{\text{on}}) \leq w(\mathcal{F}_R)$. This completes the proof of Lemma 9.

□ 
3.2 Extension to Unbounded Costs and Weights



All the results and proof techniques in this paper naturally extend to general weights and costs. In particular, 
one can obtain the following theorem (analogous to Theorem 2) in the PSA model without the assumption 
(6) via calculating explicit quantities in each step in the proof of Theorem 2. 



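Theorem 10. For any advertised strategy $s^{ad}$ in the PSA model,

$$E[\mathrm{cost}(s'')] \leq \begin{cases} O\!\left(\Delta_2 \cdot \lceil c_{\max}/w_{\min} \rceil \cdot c_{\max}/c_{\min}^2\right) \cdot \mathrm{cost}(s^{ad})^2 & \text{if } F_{\max} = O(1) \\ O\!\left(\lceil c_{\max}/w_{\min} \rceil \cdot c_{\max}/c_{\min}\right) \cdot \mathrm{cost}(s^{ad}) & \text{if } F_{\max} = 2. \end{cases} \qquad (14)$$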



4 Learn-then-decide 

We now study the set cover game in the learn-then-decide (LTD) model of [2]. In contrast to PSA, agents
in LTD are neither strictly receptive nor strictly best responders in the initial exploration phase, but they
choose one of these options for the final exploitation phase:

1: Play begins in an arbitrary state, and a central authority advertises joint action $s^{ad} \in S$. Player $i$ is
associated with a fixed probability $p_i \in (0, 1)$. Agents are chosen to update uniformly at random
for each of $T^*$ time steps. When $i$ updates, he plays $s_i^{ad}$ with probability $p_i$ or best response with
probability $1 - p_i$. The state at time $T^*$ is denoted $s'$.

2: At time $T^*$, all agents in random order individually commit arbitrarily to $s_i^{ad}$ or best response. Then
agents take turns in random order playing their chosen strategy until they reach a Nash equilibrium $s''$
given the fixed behavior of $s^{ad}$ followers.
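The two LTD phases can likewise be sketched as a simulation. This is a minimal sketch under stated assumptions: the arbitrary commitment of each agent in Phase 2 is supplied by the caller through the hypothetical boolean list `follows`, ties in best response keep the current action, and all function and parameter names are illustrative rather than taken from [2].

```python
import random


def cost_if(i, action, state, sets, c, w):
    """Cost agent i would pay for `action` while all other agents keep their actions."""
    if action:
        return c[i]
    return sum(w[k] for k, s in enumerate(sets)
               if i in s and not any(state[j] for j in s if j != i))


def best_response(i, state, sets, c, w):
    """Cheapest action for agent i; on a tie, keep the current action."""
    stay = cost_if(i, state[i], state, sets, c, w)
    switch = cost_if(i, not state[i], state, sets, c, w)
    return (not state[i]) if switch < stay else state[i]


def ltd(initial, s_ad, p, follows, t_star, sets, c, w, rng):
    """Return (s', s''): the states at the end of Phase 1 and Phase 2."""
    n = len(initial)
    state = list(initial)
    # Phase 1 (exploration): a uniformly random agent updates at each step.
    for _ in range(t_star):
        i = rng.randrange(n)
        if rng.random() < p[i]:
            state[i] = s_ad[i]                       # try the advertised action
        else:
            state[i] = best_response(i, state, sets, c, w)
    s_prime = list(state)
    # Phase 2 (exploitation): agents commit in random order.
    order = list(range(n))
    rng.shuffle(order)
    for i in order:
        if follows[i]:
            state[i] = s_ad[i]                       # committed follower of s_ad
    free = [i for i in order if not follows[i]]
    changed = True
    while changed:                                   # committed best responders
        changed = False
        for i in free:
            action = best_response(i, state, sets, c, w)
            if action != state[i]:
                state[i] = action
                changed = True
    return s_prime, state


# Star graph on six agents with c = 0.5 and unit weights.
sets = [frozenset({0, leaf}) for leaf in range(1, 6)]
costs, weights = [0.5] * 6, [1] * 5
start = [False] + [True] * 5
advertised = [True] + [False] * 5
s1, s2 = ltd(start, advertised, [0.5] * 6, [False] * 6, 200, sets, costs, weights,
             random.Random(1))
```

Only strictly improving moves are made in the final loop, so the potential decreases at every change and the loop terminates in a state where no committed best responder wants to deviate.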



Effect of Advertising in LTD 

For convenience, in this section we again assume costs and weights are bounded above and below, i.e.
the assumption (6). The following result in the LTD model is analogous to Theorem 2 in the PSA model.

Theorem 11. There exists a $T^* \in \mathrm{poly}(n)$ such that for any advertising strategy $s^{ad}$ in the LTD model,

$$E[\mathrm{cost}(s'')] \leq \begin{cases} O(1) \cdot \mathrm{cost}(s^{ad}) & \text{if } F_{\max} = 2 \\ O(\Delta_2) \cdot \mathrm{cost}(s^{ad})^2 & \text{if } F_{\max} = O(1). \end{cases} \qquad (15)$$



Theorem 11 implies that if $s^{ad}$ is obtained from the poly-time $O(F_{\max})$-approximation algorithm
described in Section 2.2, the following corollary holds.

Corollary 12. There exists a poly-time algorithm to find an advertising strategy $s^{ad}$ for the LTD model such
that

$$
E[\mathrm{cost}(s'')] \le
\begin{cases}
O(\Delta_2) \cdot \mathrm{OPT}^2 & \text{if } F_{\max} = O(1) \\
O(1) \cdot \mathrm{OPT} & \text{if } F_{\max} = 2
\end{cases}
$$

Proof of Theorem 11



To begin with, we note that while LTD differs from PSA in both phases, the proof that cost is low in Phase 
1 of LTD is very similar to the proof of Theorem 2. However, showing that the cost stays low in Phase 2 
imposes new challenges. 
We will use the same notation as in the proof of Theorem 2. We first define $\mathcal{E} = \mathcal{E}(T', T^*)$ for $1 < T' < T^*$
as the event that every element in $L$ updates at least once before time $T'$ after every element in $R$ has
updated at least once, and then every element in $R$ again updates at least once at some time $t \in [T', T^*]$.
Clearly there exist some $T', T^* \in \mathrm{poly}(n)$ such that $\mathcal{E} = \mathcal{E}(T', T^*)$ happens with high
probability, i.e.

$$\Pr[\mathcal{E}] \ge 1 - \frac{1}{n^{F_{\max}}}.$$
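The existence of polynomial $T'$ and $T^*$ can be made concrete with a coupon-collector style union bound. The sketch below is an illustration under assumptions, not the paper's own calculation: it splits time into three windows of equal length $W$ (all of $R$ updates, then all of $L$, then all of $R$ again), uses that a fixed agent misses a window with probability $(1 - 1/n)^W$, and takes a hypothetical target exponent `k` for the failure probability $n^{-k}$. The function names are also hypothetical.

```python
import math

def failure_bound(n, window):
    """Union bound on Pr[some agent misses its window], over 3 windows of n agents."""
    return 3 * n * (1 - 1 / n) ** window

def window_length(n, k):
    """A window length W (for n >= 2) with failure_bound(n, W) <= n**(-k)."""
    per_step = -math.log(1 - 1 / n)  # at least 1/n, so W = O(k * n * log n)
    return math.ceil(((k + 1) * math.log(n) + math.log(3)) / per_step) + 1

n, k = 50, 3
W = window_length(n, k)
T_prime, T_star = 2 * W, 3 * W  # windows: [1, W], (W, 2W], (2W, 3W]
print(W, T_prime, T_star, failure_bound(n, W))
```

Since $-\ln(1 - 1/n) \ge 1/n$, the window length returned is $O(k \, n \log n)$, so both $T'$ and $T^*$ are polynomial in $n$ for any constant `k`.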



Then, we have

$$
\begin{aligned}
E[\mathrm{cost}(s'')] &= \Pr[\mathcal{E}] \cdot E[\mathrm{cost}(s'') \mid \mathcal{E}] + \Pr[\mathcal{E}^c] \cdot E[\mathrm{cost}(s'') \mid \mathcal{E}^c] \\
&\le E[\mathrm{cost}(s'') \mid \mathcal{E}] + \frac{1}{n^{F_{\max}}} \cdot O(n^{F_{\max}}) \\
&= E[\mathrm{cost}(s'') \mid \mathcal{E}] + O(1),
\end{aligned}
\tag{16}
$$

where for the inequality we use the fact that the social cost is always bounded above by $c_{\max} \cdot n + w_{\max} \cdot |\mathcal{F}| = O(n^{F_{\max}})$ from (2) and (6).

Therefore, it suffices to bound $E[\mathrm{cost}(s'') \mid \mathcal{E}]$, where our choice of $T^*$ serves primarily to guarantee
that $\mathcal{E}$ happens with such high probability. We first bound the expected social cost at the end of Phase 1
under the event $\mathcal{E}$. Later, we will bound the increase in the social cost in Phase 2.

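Lemma 13.

$$
E[\mathrm{cost}(s') \mid \mathcal{E}] \le
\begin{cases}
O(\Delta_2) \cdot \mathrm{cost}(s^{ad})^2 & \text{if } F_{\max} = O(1) \\
O(1) \cdot \mathrm{cost}(s^{ad}) & \text{if } F_{\max} = 2
\end{cases}
$$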

Proof. As in the proof of Theorem 2, we again note that

$$\mathrm{cost}(s') = \mathrm{cost}(s^{ad}) + O(|R_{on}|) + O(w(\mathcal{F}_{bad})) \tag{17}$$
$$\mathrm{cost}(s^{ad}) = \Omega(|L|) + \Omega(|\mathcal{F}_R|). \tag{18}$$

Recall that we use the same notation as in the proof of Theorem 2.

Hence, it suffices to bound $w(\mathcal{F}_{bad})$ and $|R_{on}|$ in terms of $|L|$ and $|\mathcal{F}_R|$. First consider $w(\mathcal{F}_{bad})$. We
separately analyze the weights of two types of sets in $\mathcal{F}_{bad}$. First consider a set in $\mathcal{F}_{bad} \cap 2^L$, i.e. a set consisting
only of elements in $L_{off}$. Suppose we attribute the weight of such a set to its element $\ell$ that updated most
recently before the end of Phase 1. Because $\ell \in L_{off}$ played best response most recently, the weight of all
sets in $\mathcal{F}_{bad} \cap 2^L$ attributed to $\ell$ is at most $c_\ell$. Summing over all $\ell \in L_{off} \subseteq L$ gives

$$w(\mathcal{F}_{bad} \cap 2^L) \le c(L) = O(|L|). \tag{19}$$

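Now consider a set in $\mathcal{F}_{bad} \setminus 2^L$, i.e. a set which has elements in both $L_{off}$ and $R$, all of which are off at
the end of Phase 1. By definition of $\mathcal{F}_{off}$, $\mathcal{F}_{bad} \setminus 2^L \subseteq \mathcal{F}_{off}$. Assuming the event $\mathcal{E}$, the arguments
bounding $|\mathcal{F}_{off}|$ in the proof of Lemma 7 work identically in the LTD model (using $\beta$ instead of $\alpha$), i.e. we
have

$$
E[|\mathcal{F}_{off}| \mid \mathcal{E}] \le
\begin{cases}
\Delta_2 \cdot |L|^2 + O(\Delta_2) \cdot |L| & \text{if } F_{\max} = O(1) \\
O(|L|) & \text{if } F_{\max} = 2
\end{cases}
\tag{20}
$$

From (19) and (20), it follows that

$$
E[w(\mathcal{F}_{bad}) \mid \mathcal{E}] =
\begin{cases}
O(\Delta_2 \cdot |L|^2) + O(\Delta_2) \cdot |L| & \text{if } F_{\max} = O(1) \\
O(|L|) & \text{if } F_{\max} = 2
\end{cases}
\tag{21}
$$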
Now, assuming event $\mathcal{E}$, one can observe that the conclusion and proof strategy of Lemma 7 also
work for $|R_{on}|$ in the LTD model, i.e.

$$
E[|R_{on}| \mid \mathcal{E}] \le
\begin{cases}
|\mathcal{F}_R| + O(\Delta_2 \cdot |L|^2) + O(\Delta_2) \cdot |L| & \text{if } F_{\max} = O(1) \\
|\mathcal{F}_R| + O(|L|) & \text{if } F_{\max} = 2
\end{cases}
\tag{22}
$$

Therefore, combining the bounds (17), (18), (21) and (22) leads to the desired bound of Lemma
13. □

We now bound the cost increase in Phase 2 assuming $\mathcal{E}$. From (4) and $F_{\max} = O(1)$ it suffices to
provide a bound on the expected increase in the potential function throughout Phase 2, i.e.

$$
\begin{aligned}
\mathrm{cost}(s'') &\le F_{\max} \cdot \Phi(s'') + F_{\max} \cdot \big({-\Phi(s')} + \mathrm{cost}(s')\big) \\
&= O(\Phi(s'') - \Phi(s')) + O(\mathrm{cost}(s')).
\end{aligned}
\tag{23}
$$

The following lemma bounds the expected potential increase $\Phi(s'') - \Phi(s')$ assuming event $\mathcal{E}$. Finally,
combining (16), (23), Lemma 13 and Lemma 14 leads to the desired conclusion of Theorem 11.



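Lemma 14.

$$
E[\Phi(s'') - \Phi(s') \mid \mathcal{E}] \le
\begin{cases}
O(\Delta_2) \cdot \mathrm{cost}(s^{ad})^2 & \text{if } F_{\max} = O(1) \\
O(1) \cdot \mathrm{cost}(s^{ad}) & \text{if } F_{\max} = 2
\end{cases}
$$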



Proof. Since best response moves do not increase the potential function $\Phi$, we only consider updates of
agents following the advertising strategy $s^{ad}$ in Phase 2. Since each '$s^{ad}$ follower' changes strategies at
most once in Phase 2, it suffices to consider a single off-on move (following $s^{ad}$) for each agent in $L$ and a
single on-off move (following $s^{ad}$) for each agent in $R_{on}$. For each $\ell \in L$, an off-on move (i.e. $\ell$ changes
his decision from off to on) increases the potential by at most $c_\ell$. Hence,

the total potential increase by off-on moves is at most $c(L) = O(|L|)$. (24)

Now consider the other type of move, i.e. a single on-off move for each agent in $R_{on}$. For each $r \in R_{on}$
that first turns off at time $t > T^*$, let $\mathcal{F}_r$ be the collection of sets containing $r$ such that all of their other
elements are off at time $t$. Then the potential increases by at most $w(\mathcal{F}_r) = O(|\mathcal{F}_r|)$ at time $t$. Hence,

the total potential increase by on-off moves is at most $O\big(\sum_{r \in R_{on}} |\mathcal{F}_r|\big) = O\big(\big|\bigcup_{r \in R_{on}} \mathcal{F}_r\big|\big)$. (25)

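We will bound the expectation of $\big|\bigcup_{r \in R_{on}} \mathcal{F}_r\big|$. To this end, we consider two types of sets $\sigma$ that include an
element of $R_{on}$: (a) $\sigma$ has an element $\ell_\sigma \in L$ that was on at the end of Phase 1, and (b) otherwise. Observe
that the expected number of sets of type (b) that are not in $\mathcal{F}_R$ is already bounded in the proof of Lemma
13 by $|\mathcal{F}_{off}|$, i.e.

$$
E\Big[\text{the number of sets in } \bigcup_{r \in R_{on}} \mathcal{F}_r \setminus \mathcal{F}_R \text{ of type (b)} \,\Big|\, \mathcal{E}\Big] \le
\begin{cases}
\Delta_2 \cdot |L|^2 + O(\Delta_2) \cdot |L| & \text{if } F_{\max} = O(1) \\
O(|L|) & \text{if } F_{\max} = 2
\end{cases}
$$

Therefore, we have

$$
E\Big[\text{the number of sets in } \bigcup_{r \in R_{on}} \mathcal{F}_r \text{ of type (b)} \,\Big|\, \mathcal{E}\Big] \le
\begin{cases}
|\mathcal{F}_R| + \Delta_2 \cdot |L|^2 + O(\Delta_2) \cdot |L| & \text{if } F_{\max} = O(1) \\
|\mathcal{F}_R| + O(|L|) & \text{if } F_{\max} = 2
\end{cases}
\tag{26}
$$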
Thus, we only focus on sets $\sigma$ of type (a). Let $\mathcal{F}_{\ell_\sigma, R_{off}}$ be the sets containing $\ell_\sigma$ whose elements
in $R$ are all off at the end of Phase 1. Our key observation here is that a set $\sigma$ of type (a) can only
become uncovered when an $r \in \sigma \cap R_{on}$ turns off if all but at most $\lceil c_{\max}/w_{\min} \rceil$ sets in $\mathcal{F}_{\ell_\sigma, R_{off}}$ have an
element in $R$ that updates before $r$ updates. Otherwise, $\ell_\sigma$ has too many uncovered sets to turn off before
$r$ turns off (hence, it remains on). For an arbitrary updating order of the agents in $R \setminus \{r\}$, there are at
least $|\mathcal{F}_{\ell_\sigma, R_{off}}| / \Delta_2$ elements that are the first updating agent in some set $\rho \in \mathcal{F}_{\ell_\sigma, R_{off}}$. Therefore, for each
$r \in \sigma \cap R_{on}$,



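$$\Pr[\sigma \in \mathcal{F}_r \mid \mathcal{E}] \le \frac{\lceil c_{\max}/w_{\min} \rceil + 1}{[\ldots]}$$

Using the union bound,

$$\Pr\Big[\sigma \in \bigcup_{r \in R_{on}} \mathcal{F}_r \,\Big|\, \mathcal{E}\Big] \le F_{\max} \cdot \frac{\lceil c_{\max}/w_{\min} \rceil + 1}{[\ldots]}$$

Now let $\mathcal{F}_{\ell, R} \subseteq \mathcal{F}$ be the sets containing $\ell \in L$ and at least one element of $R$. Note also that given
$\mathcal{E}$, the random variable $D_\ell := |\mathcal{F}_{\ell, R_{off}}|$ has (first-order) dominance over the binomial random variable $X \sim B([\ldots])$. Using this, we have

$$
E\Big[\text{the number of sets in } \bigcup_{r \in R_{on}} \mathcal{F}_r \text{ of type (a)} \,\Big|\, \mathcal{E}\Big] \le [\ldots] = O\big(\Delta_2 \cdot \lceil c_{\max}/w_{\min} \rceil \cdot |L|\big),
\tag{27}
$$

where the second inequality uses the fact that $E[1/(1+Y)] \le [\ldots]$ for a binomial random variable $Y \sim B(n, p)$.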
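The fact about $E[1/(1+Y)]$ for binomial $Y$ can be checked numerically. The sketch below assumes the intended bound is the standard one: for $Y \sim B(n, p)$, the identity $E[1/(1+Y)] = \frac{1 - (1-p)^{n+1}}{(n+1)p}$ holds, which gives $E[1/(1+Y)] \le \frac{1}{(n+1)p}$. Whether this is the exact form used in (27) is an assumption; the function names are hypothetical.

```python
from math import comb

def expected_inverse(n, p):
    """Exact E[1/(1+Y)] for Y ~ B(n, p), summed directly over the pmf."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) / (1 + k)
               for k in range(n + 1))

def closed_form(n, p):
    """The identity (1 - (1-p)^(n+1)) / ((n+1) p) for the same expectation."""
    return (1 - (1 - p)**(n + 1)) / ((n + 1) * p)

for n, p in [(5, 0.3), (20, 0.5), (50, 0.1)]:
    print(n, p, expected_inverse(n, p), closed_form(n, p), 1 / ((n + 1) * p))
```

The identity follows from $\binom{n}{k} \frac{1}{k+1} = \binom{n+1}{k+1} \frac{1}{n+1}$, which turns the sum into a binomial expansion over $n+1$ trials with the $k = 0$ term removed.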

Finally, combining (24), (25), (26) and (27) leads to the desired conclusion of Lemma 14, where we
recall that $\mathrm{cost}(s^{ad}) = \Omega(|L|) + \Omega(|\mathcal{F}_R|)$ and $\Delta_2 = 1$ when $F_{\max} = 2$.

□



5 Extension to Packing Games 

Notice that our covering games correspond to packing games if we simply redefine the costs such that $i$
pays $c_i$ if he is off and he pays the sum of the weights of fully-covered sets he participates in if he is on.
Roughly speaking, the game strives to find a large packing, which is determined by the set of on agents,
while avoiding fully covered sets. Since we are simply relabeling actions, all the results from the previous
sections apply. The packing interpretation of this problem is most easily seen with the simple example where
sets are of size 2, $c_i = c < 1$ for all $i$, and $w_\sigma = 1$ for all $\sigma$. In the original formulation of the problem, the
sets of on agents in Nash equilibria are minimal vertex covers. In the new formulation, the sets of on agents
in Nash equilibria are maximal independent sets.
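The size-2 example can be verified by brute force on a small graph. The sketch below is illustrative only: it assumes the covering cost model described above (an on agent pays $c$; an off agent pays 1 for each incident edge whose other endpoint is also off), enumerates all pure states of a 4-vertex path, and checks that the on sets at Nash equilibria are exactly the minimal vertex covers, while their complements (the on sets after relabeling) are maximal independent sets. All function names are hypothetical.

```python
from itertools import product

def is_nash(state, n, edges, c):
    for i in range(n):
        # Edges at i whose other endpoint is off: uncovered if i is off too.
        exposed = sum(1 for u, v in edges
                      if i in (u, v) and not state[v if u == i else u])
        if state[i] and c > exposed:      # on, but turning off is cheaper
            return False
        if not state[i] and exposed > c:  # off, but turning on is cheaper
            return False
    return True

def equilibria_on_sets(n, edges, c):
    """On sets of all pure Nash equilibria of the covering game."""
    return [frozenset(i for i in range(n) if s[i])
            for s in product([False, True], repeat=n)
            if is_nash(s, n, edges, c)]

def is_vertex_cover(on, edges):
    return all(u in on or v in on for u, v in edges)

def is_minimal_vertex_cover(on, edges):
    return is_vertex_cover(on, edges) and all(
        not is_vertex_cover(on - {v}, edges) for v in on)

def is_maximal_independent_set(on, n, edges):
    independent = lambda s: not any(u in s and v in s for u, v in edges)
    return independent(on) and all(
        not independent(on | {v}) for v in range(n) if v not in on)

n, edges, c = 4, [(0, 1), (1, 2), (2, 3)], 0.5
covers = equilibria_on_sets(n, edges, c)
packings = [frozenset(range(n)) - on for on in covers]  # relabel on <-> off
print(sorted(map(sorted, covers)), sorted(map(sorted, packings)))
```

Enumeration is exponential in the number of agents, so this is only a sanity check for toy instances and assumes the graph has no isolated vertices.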

6 Conclusions 

In recent years, game theoretic frameworks have provided informative models for analyzing the outcomes
of games among autonomous agents or components programmed as autonomous agents. However, many
games, including those studied in this paper, often suffer from high Price of Anarchy, meaning that without
a central authority it is hard to induce a state with low social cost. In this paper we study how weak
broadcasting signals from a central authority are enough to induce states with low social cost in a general
class of covering and packing problems. In particular, we show that for any advertising strategy $s^{ad}$, games
with constantly bounded costs and weights converge either in the public service advertising model of [1]
or in the learn-then-decide model of [2] to a state with cost $O(\mathrm{cost}(s^{ad})^2)$. Moreover, in both models
we show convergence to a state of cost $O(\mathrm{cost}(s^{ad}))$ if all sets are of size 2. Furthermore, for particular
and poly-time computable $s^{ad}$ in the PSA model, we guarantee convergence to a state within an $O(\log n)$
factor of optimum for any game with sets of constant size. We believe that the techniques introduced in this
paper to analyze covering and packing games could be of broader interest for analyzing classic optimization
problems in a distributed fashion.
A Proof of Proposition 8

The desired conclusion of Proposition 8 for the case $c < 1$ follows since $d\,(1 - [\ldots]) = O(1)$ for all $d \ge [\ldots]$ as
long as $\alpha \in (0, 1)$ is constant. Hence, assume $c \ge 1$. Let $\bar{\alpha} = \max(\alpha, 1 - \alpha)$ and define $\xi \in (0, 1)$ to be the
largest constant satisfying

$$[\ldots]$$

For each $\ell$, we have either $d < c/\xi$ or $d \ge c/\xi$. For the case with $d < c/\xi$, observe that with $c < d$, the
desired expression is at most



$$[\ldots]$$

Now consider when $d \ge c/\xi$. Observe that

$$[\ldots]$$

Further, we have

$$[\ldots] = O(c),$$

where we use (a) $d \cdot \bar{\alpha}^{d/2}$ is $O(1)$, (b) $d^i/i!$ is increasing with respect to $i$ for $i < c < d$, (c) $x! = \Omega((x/e)^x)$,
(d) $c \le \xi \cdot d$ and (e) the definition of $\xi$. This completes the proof of Proposition 8.